home *** CD-ROM | disk | FTP | other *** search
Text File | 1996-06-07 | 27.1 KB | 944 lines | [TEXT/R*ch] |
- /*
- 8 April 1996.
- Submission to MacTech Programmer's challenge for April-96.
- Copyright 1996, Ernst Munter, Kanata, ON, Canada.
-
- "Mutant Life"
-
- | Revisions 7 April 1996:
- | In "PropagateLife", added check to return correct
- | generation count with an empty map, and no spontaneous
- | birth.
- | Added #define LARGE_BITMAPS to allow trading of 20% space
- | for 3% loss in speed with large maps
-
- | Revision 8 April 1996:
- | In "PropagateLife", reduced number of iterations of main
- | loop by one, and added special exit code.
- | This solves the problem with the empty map; but it also
- | reduces the time for all runs (40% saving on a single
- | generation run).
-
- Problem Statement
- -----------------
- The challenge is to write a fast program that computes the
- Life-like world some generations into the future, Inter-
- mediate generations need not necessarily be computed, but
- to me at least, it seems we must process one generation at
- a time.
-
- The birth and death rules are specified as bit maps which
- determine changes as a function of any combination of the
- number of cell neighbors (0 to 8), giving rise to 2**18
- possible rule sets.
-
- Another requirement is that the propagation function stop
- as soon as two consecutive worlds are identical.
-
- Additional memory is limited to 1MB plus up to 10 times
- the size of the original bitmap.
-
- Solution Strategy
- -----------------
- We copy the starting bitmap to an internal representation;
- process this for the required number of generations, or
- until we see no change; and then copy the final state back
- to the original bitmap location.
-
- The internal cell format includes the state as well as the
- neighbor count of each cell.
-
- The propagation process consists of two phases for each
- generation: phase 1 in which all changes are used to
- update the neighbor counts, and phase 2 in which the
- neighbor counts are re-evaluated, together with the
- current state, to determine the change to the next state.
-
- Since usually, not all of the cell map changes, it is
- only necessary to process those cells, and their immediate
- neighbors, that do change.
-
- Data Structure
- --------------
- Because of memory limitation, and in the interest of speed
- blocks of 32 cells are stored and processed together.
-
- A "CellBlock" of 32 cells is a 32-byte structure which
- contains cell states, cell changes, counters, and pointers
- to allow cell blocks to be chained into linked lists.
-
- A number of flag bits are stored with each block to
- allow marking of border blocks which require special
- computation to determine the neighbor cells (wrapping).
-
- The counts are stored modulo-9 arithmetic, A 13 bit
- counter holds the neighbor counts for 4 cells, and there
- are 8 such counters located in each cell block.
-
- CellBlocks are allocated dynamically.
-
- Computation and Tables
- ----------------------
- We must perform 2 kinds of computation:
-
- Count Updates
- -------------
- To update the neighbor counts we determine the addresses
- of the neighbors and increment or decrement each modulo-9
- counter depending on the change of state to 0 or 1,
-
- The counter update function is fixed (not dependent on
- birth or death rules). The changes in a row of 10 cells
- affect exactly 8 counts in the same row and the blocks
- above and below. Hence a lookup table with 1024 entries
- can be used to provide the pre-computed sums of power-9
- products with 1 or 0, for two of the 13-bit counters
- which are conveniently stored as a pair in a single word.
-
- State update and its rules
- --------------------------
- To determine the state change of a cell we need consider
- only the state of the cell, the neighbor count, and the
- current set of birth/death rules.
-
- Given that 4 counts are stored, munged together in a
- single 13-bit modulo-9 counter, we can also use a look-up
- table for this function. The rules table contains 9*9*9*9
- entries of 1 byte each, where each byte contains a 4-bit
- birth field, and a 4-bit death field, simply derived from
- the bits in the given birth/deathRules selected according
- to the four neighbor counts implied by the table address.
-
- This rules table must be computed when PropagateLife is
- called, to reflect the current rules.
-
- Assuming that PropagateLife might be called frequently
- in succession, often with the same rules, we can avoid
- re-computing this table unnecessarily.
-
- And since we can use a "reasonable" amount of static
- memory (up to 1MB, say), we can store a fairly big
- rule book of many different rules. This would allow
- alternating rules to be handled efficiently as well.
-
- Hence, all left over available (permitted) memory is used
- for the rule book. The 18-bits of concatenated birth and
- death rules are hashed into a page index to quickly locate
- a previously computed rules page.
-
- Other Assumptions
- -----------------
- cells is a bitmap where
- cells.bounds.top=cells.bounds.left==0
- cells.bounds.bottom - cells.bounds.top >= 3
- cells.bounds.right - cells.bounds.left >= 3
-
- The maximum static memory limit can be #defined,
- minimum 15K.
-
- The function allocates dynamic memory equivalent to
- 8 more bitmaps + 32 bytes. It returns 0 if malloc fails.
-
- "Spontaneous birth" is allowed, (birthRules LSB set).
- */
-
- /***** My header file, please adapt as needed **********/
-
- #include <stdlib.h>
- #include <stdio.h>
- #include <memory.h>
-
- struct Rect {
- short top;
- short left;
- short bottom;
- short right;
- };
-
- struct BitMap {
- void* baseAddr;
- short rowBytes;
- Rect bounds;
- };
-
- #ifdef __BORLANDC__
- long pascal PropagateLife(
- #else
- pascal long PropagateLife(
- #endif
- BitMap cells,
- long numGenerations,
- short birthRules,
- short deathRules);
-
- /******************** End of header *******************/
-
- // Define amount of static memory to be available,
- // minimum 15K:
- // the value of 1MB permits 158 rule pages to be stored
-
- #define availableStatic 0x100000L
-
- // You can reduce the size of the program by about 20% by
- // setting LARGE_BITMAPS to 0.
- // But runtime for large worlds (1000x1000) goes up 3-4%.
- // Normally, we would go with the shorter program, but in
- // this challenge, 3% might make a lot of difference:
-
- #define LARGE_BITMAPS 1
-
- /************ Type Definitions *************************/
-
- #define ulong unsigned long
- #define ushort unsigned short
- #define uchar unsigned char
-
- struct CellBlock {
- ulong state;
- ulong change;
- CellBlock* link1;
- CellBlock* link2;
- long count[4];
- };
- typedef struct CellBlock CellBlock;
-
- struct RulesPage {
- int birthRule;
- int deathRule;
- uchar rules[81*81];
- char pad[3];
- };
- typedef struct RulesPage RulesPage;
-
- struct Bits2Counts {
- long UD;
- long LR;
- };
- typedef struct Bits2Counts Bits2Counts;
-
- /*************** Function Prototypes *******************/
-
- uchar* MakeRuleTable(
- int b,
- int d);
-
- CellBlock* LoadCells(
- BitMap cells,
- int width,
- int padWords,
- CellBlock* cb,
- CellBlock* L1);
-
- CellBlock* LoadCellsSpontaneous(
- BitMap cells,
- int width,
- int padWords,
- CellBlock* cb,
- CellBlock* L1);
-
- CellBlock* UpdateCounts(
- CellBlock* cb,
- CellBlock* L2,
- int width,
- int size,
- int lastColumn,
- long rightMask);
-
- #if LARGE_BITMAPS
- CellBlock* UpdateCountsNoWrap(
- CellBlock* cb,
- CellBlock* L2,
- int width);
- #endif
-
- ulong ComputeStateChange(
- CellBlock* cb,
- uchar* rule);
-
- void UnloadCells(
- CellBlock* cb,
- int size,
- void* baseAddr);
-
- void UnloadPaddedCells(
- BitMap cells,
- int width,
- int padWords,
- CellBlock* cb);
-
- /******************** Static Data **********************/
-
- #define bcUD(i) ( \
- ((((i)>>5)&1)*729L)+ \
- ((((i)>>4)&1)*810L)+ \
- ((((i)>>3)&1)*819L)+ \
- ((((i)>>2)&1)*91L)+ \
- ((((i)>>1)&1)*10L)+ \
- ((i)&1) \
- )
-
- #define bcLR(i) ( \
- ((((i)>>5)&1)*729L)+ \
- ((((i)>>4)&1)*81L)+ \
- ((((i)>>3)&1)*738L)+ \
- ((((i)>>2)&1)*82L)+ \
- ((((i)>>1)&1)*9L)+ \
- ((i)&1) \
- )
-
- #define bc(i) { \
- (bcUD((i)>>4)<<13)+bcUD(i),(bcLR((i)>>4)<<13)+bcLR(i)}
-
- #define bc4(i) \
- bc(i),bc(i+1),bc(i+2),bc(i+3)
- #define bc16(i) \
- bc4(i),bc4(i+0x4),bc4(i+0x8),bc4(i+0xC)
- #define bc64(i) \
- bc16(i),bc16(i+0x10),bc16(i+0x20),bc16(i+0x30)
- #define bc256(i) \
- bc64(i),bc64(i+0x40),bc64(i+0x80),bc64(i+0xC0)
-
- static Bits2Counts BC[0x400] = {
- bc256(0),bc256(0x100),bc256(0x200),bc256(0x300)
- };
-
- #define numRulesPages \
- ((availableStatic-sizeof(BC)-16)/sizeof(RulesPage))
-
- static RulesPage rulesCache[numRulesPages];
-
- /********************** Constants **********************/
-
- #define MSB 0x80000000L
- #define LSB 0x00000001L
-
- #define UP 0x40000000L
- #define DOWN 0x20000000L
- #define LEFT 0x10000000L
- #define RIGHT 0x08000000L
-
- #define IS_U_BORDER(cb) (cb->count[0] & UP)
- #define IS_D_BORDER(cb) (cb->count[0] & DOWN)
- #define IS_L_BORDER(cb) (cb->count[0] & LEFT)
- #define IS_R_BORDER(cb) (cb->count[0] & RIGHT)
-
- /******************* Top level function ****************/
-
- #ifdef __BORLANDC__
- long pascal PropagateLife(
- #else
- pascal long PropagateLife(
- #endif
- BitMap cells,
- long numGenerations,
- short birthRules,
- short deathRules) {
-
- int width=(cells.bounds.right+31)>>5;
- int lastColumn=(cells.bounds.right-1) & 31;
- long rightMask=0xFFFFFFFF << (31-lastColumn);
- int size=width*cells.bounds.bottom;
- int padWords=(cells.rowBytes>>2)-width;
- long gen;
- long* allocMem;
- CellBlock* cb;
- CellBlock* L1;
- CellBlock* L2;
- uchar* rule;
-
- // Get memory for cell blocks, 1 block per 32 cells:
-
- if (0==(allocMem=(long*)malloc(32+size*sizeof(CellBlock))))
- return 0;
-
- // Align cell blocks with cache-line boundary (32-byte):
-
- { char* temp=(char*)allocMem;
- temp+=(32-(ulong)temp) & 31;
- cb=(CellBlock*)temp;
- }
-
- // Clear all cell blocks:
-
- memset(cb,0,size*sizeof(CellBlock));
-
- // Establish the rules look-up table in static memory:
-
- rule=MakeRuleTable(birthRules,deathRules);
-
- // Copy BitMap cells into CellBlock structures and link:
- // Usually, only non-empty cells are linked, but
- // if birthRules include a "spontaneous birth" bit,
- // all cells must be linked.
-
- if (birthRules & 1)
- L1=LoadCellsSpontaneous(cells,width,padWords,cb,cb-1);
- else L1=LoadCells(cells,width,padWords,cb,cb-1);
- L2=L1;
-
- for (gen=1;;gen++) {
-
- // Linked list 1 contains cell blocks with a state change.
- // The state changes are applied to the cell states.
- // Loop 1 traverses list 1 and updates the counts of all
- // affected cells in neighboring blocks and itself.
- // If any counts change, the block is linked into list 2,
- // unless it is already there.
-
- while (L1>=cb) {
- #if LARGE_BITMAPS
- if ((L1->count[0] & (UP | DOWN | LEFT | RIGHT)) == 0)
- L2=UpdateCountsNoWrap(L1,L2,width);
- else
- #endif
- L2=UpdateCounts(L1,L2,width,size,lastColumn,rightMask);
- L1=L1->link1;
- }
-
- // List 1 is now empty.
- // Loop 2 finds the cell changes in each block of list 2,
- // using the current states and the neighbor counts.
- // If any state changes, the block is put back into list 1.
-
- while (L2>=cb) {
- CellBlock* next=L2->link2;
- if (ComputeStateChange(L2,rule)) {
- L2->link1=L1;
- L1=L2;
- }
- L2->link2=0;
- L2=next;
- }
-
- // If list 1 is empty (no change) we can make a fast exit:
-
- if (L1<cb) {
- if (gen==1) goto quick_exit; // nothing changed!
- break;
- }
- if (gen>=numGenerations) break;
- }
-
- // Apply last changes to cells in list 1:
-
- while (L1>=cb) {
- L1->state ^= L1->change;
- L1=L1->link1;
- }
-
- // The final states of all cells in the private cell blocks
- // are copied back into the external BitMap storage.
- // A slightly slower function provides for the uncommon case
- // when there are extra unused words beyond the right bound.
-
- if (padWords==0) UnloadCells(cb,size,cells.baseAddr);
- else UnloadPaddedCells(cells,width,padWords,cb);
-
- // Return all allocated dynamic memory to the system:
- quick_exit:
- free(allocMem);
-
- return gen;
- }
-
- /***************** Macros used in local functions ******/
-
- // SETBLOCK is used in LoadCells(..)
- #define SETBLOCK(flag) \
- { ulong c32=*bm++; \
- cb->count[0]=flag; \
- if (c32) { \
- cb->change=c32; \
- cb->link1=L1; \
- cb->link2=L1; \
- L1=cb; \
- } \
- cb++; \
- }
-
- // SETBLOCKs is used in LoadCellsSpontaneous(..)
- #define SETBLOCKs(flag) \
- { ulong c32=*bm++; \
- cb->count[0]=flag; \
- cb->change=c32; \
- cb->link1=L1; \
- cb->link2=L1; \
- L1=cb; \
- cb++; \
- }
-
- // LINK is used in UpdateCounts(..)
- #define LINK(cell) \
- { if (((cell)->link2)==0) { \
- (cell)->link2=L2; \
- L2=cell; \
- } \
- }
-
- // CHANGE_COUNT is used in UpdateCounts(..)
- #define CHANGE_COUNT(cell,ctr,delta); \
- { long cnt=(cell)->count[ctr] | MSB; \
- (cell)->count[ctr]=cnt+(delta); \
- }
-
- // The following three macros are used in UpdateCounts(..)
- #define UPDATE_COMMON(ctr) \
- { delta=BC[newIndex].UD-BC[oldIndex].UD; \
- CHANGE_COUNT(cUP,ctr,delta); \
- CHANGE_COUNT(cDN,ctr,delta); \
- delta=BC[newIndex].LR-BC[oldIndex].LR; \
- CHANGE_COUNT(cb,ctr,delta); \
- }
-
- // C/C++ does not understand negative shifts, so
- // we make separate macros for left and right shifts;
- #define UPDATE_UD(shft,ctr) \
- { int oldIndex=(oldState >> shft) & 0x3FF; \
- int newIndex=(newState >> shft) & 0x3FF; \
- UPDATE_COMMON(ctr) \
- }
-
- #define UPDATE_UD_(shft,ctr) \
- { int oldIndex=(oldState << shft) & 0x3FF; \
- int newIndex=(newState << shft) & 0x3FF; \
- UPDATE_COMMON(ctr) \
- }
-
- /***************** Local Functions *********************/
-
- uchar* MakeRuleTable(int b,int d) {
-
- // Checks if a valid rules table exists for the current
- // rules, and computes a new table if necessary.
-
- int pageNumber=((b << 9) + d) % numRulesPages;
- RulesPage* rp=rulesCache+pageNumber;
- if ((rp->birthRule != b)
- || (rp->deathRule != d)) {
-
- int q,r,s,Q,R,S;
- int t0,t1,t2,t3,t4,t5,t6,t7,t8;
- unsigned char* t=rp->rules;
- rp->birthRule = b;
- rp->deathRule = d;
- b=b<<4;;
- t0=((d)&1) | ((b)&0x10);
- t1=((d>>1)&1) | ((b>>1)&0x10);
- t2=((d>>2)&1) | ((b>>2)&0x10);
- t3=((d>>3)&1) | ((b>>3)&0x10);
- t4=((d>>4)&1) | ((b>>4)&0x10);
- t5=((d>>5)&1) | ((b>>5)&0x10);
- t6=((d>>6)&1) | ((b>>6)&0x10);
- t7=((d>>7)&1) | ((b>>7)&0x10);
- t8=((d>>8)&1) | ((b>>8)&0x10);
- t[0]=(uchar)t0;
- t[1]=(uchar)t1;
- t[2]=(uchar)t2;
- t[3]=(uchar)t3;
- t[4]=(uchar)t4;
- t[5]=(uchar)t5;
- t[6]=(uchar)t6;
- t[7]=(uchar)t7;
- t[8]=(uchar)t8;
- t+=6561-9;
- for (s=8;s>=0;s--) {
- S=rp->rules[s]<<3;
- for (r=8;r>=0;r--) {
- R=S | (rp->rules[r]<<2);
- for (q=8;q>=0;q--) {
- Q=R | (rp->rules[q]<<1);
- t[0]=(uchar)(Q | t0);
- t[1]=(uchar)(Q | t1);
- t[2]=(uchar)(Q | t2);
- t[3]=(uchar)(Q | t3);
- t[4]=(uchar)(Q | t4);
- t[5]=(uchar)(Q | t5);
- t[6]=(uchar)(Q | t6);
- t[7]=(uchar)(Q | t7);
- t[8]=(uchar)(Q | t8);
- t-=9;
- }
- }
- }
- }
- return rp->rules;
- }
-
- CellBlock* LoadCells(
- BitMap cells,
- int width,
- int padWords,
- CellBlock* cb,
- CellBlock* L1) {
-
- // Copies the states of all cells in the bitmap into
- // the cell blocks, as state changes to trigger evaluation.
- // Sets border flags as needed.
- // Links all non-empty cell blocks in both lists.
- // The special case of "spontaneous birth" is handled by
- // a separate function LoadCellsSpontaneous(..)
-
- ulong* bm=(ulong*)cells.baseAddr;
- int x,y;
-
- if (width>1) {
-
- //top edge:
- SETBLOCK(UP+LEFT);
- for (x=1;x<width-1;x++) SETBLOCK(UP);
- SETBLOCK(UP+RIGHT);
- bm+=padWords;
-
- //middle rows:
- for (y=1;y<cells.bounds.bottom-1;y++) {
- SETBLOCK(LEFT);
- for (x=1;x<width-1;x++) SETBLOCK(0);
- SETBLOCK(RIGHT);
- bm+=padWords;
- }
-
- //bottom edge:
- SETBLOCK(DOWN+LEFT);
- for (x=1;x<width-1;x++) SETBLOCK(DOWN);
- SETBLOCK(DOWN+RIGHT);
- } else {
-
- //case of bit map <= 32 cells wide
- //top edge:
- SETBLOCK(UP+LEFT+RIGHT);
- bm+=padWords;
- //middle rows:
- for (y=1;y<cells.bounds.bottom-1;y++) {
- SETBLOCK(LEFT+RIGHT);
- bm+=padWords;
- }
- //bottom edge:
- SETBLOCK(DOWN+LEFT+RIGHT);
- }
- return L1;
- }
-
- CellBlock* UpdateCounts(
- CellBlock* cb,
- CellBlock* L2,
- int width,
- int size,
- int lastColumn,
- long rightMask) {
-
- // Processes one cell block: locates its neighbor blocks,
- // and updates their counters according to the state changes
- // of the cells in the center block.
-
- // The 32 cells of a block are divided into 4 overlapping
- // groups, corresponding to the 4 counter words affected.
- // Only counters whose count changes are accessed, and so
- // flagged (by setting MSB).
-
- // Notably, right and left neighboring cell blocks are only
- // affected if the first or last bits of the center block
- // change.
-
- // The function deals with a number of special cases such
- // as wrapping, and the possibly partially filled block at
- // the right margin of the bitmap.
-
- ulong oldState=cb->state;
- ulong theChange=cb->change;
- ulong newState;
- ulong lastBit;
- CellBlock* cUP=cb-width;
- CellBlock* cDN=cb+width;
- long delta;
-
- if (IS_U_BORDER(cb)) cUP+=size;
- if (IS_D_BORDER(cb)) cDN-=size;
- if (IS_R_BORDER(cb)) {
- theChange &= rightMask;
- lastBit=0x80000000>>lastColumn;
- } else lastBit=LSB;
-
- newState=oldState ^ theChange;
- cb->state=newState;
- if (theChange & 0xFF800000) {
- UPDATE_UD(23,0);
- if (theChange & MSB) { //carry left
- int ctr=3;
- int offset=-1;
- delta=((newState >>30) & 2) - 1;
- if (IS_L_BORDER(cb)) { //wrap
- ctr = lastColumn >> 3;
- offset+=width;
- switch (lastColumn & 7) {
- case 0:delta = (delta<<13)*729; break;
- case 1:delta = (delta<<13)*81; break;
- case 2:delta = (delta<<13)*9; break;
- case 3:delta = (delta<<13); break;
- case 4:delta *= 729; break;
- case 5:delta *= 81; break;
- case 6:delta *= 9; break;
- }
- }
- CHANGE_COUNT(cUP+offset,ctr,delta);
- LINK(cUP+offset);
- CHANGE_COUNT(cb+offset,ctr,delta);
- LINK(cb+offset);
- CHANGE_COUNT(cDN+offset,ctr,delta);
- LINK(cDN+offset);
- }
- }
- if (theChange & 0x01FF8000) UPDATE_UD(15,1);
- if (theChange & 0x0001FF80) UPDATE_UD( 7,2);
- if (theChange & 0x000001FF) UPDATE_UD_(1,3);
- if (theChange & lastBit) { //carry right
- int offset=+1;
- if (IS_R_BORDER(cb)) { //wrap
- delta = (((newState >> (31-lastColumn)) << 1) & 2) -1;
- offset -= width;
- } else delta = (((newState << 1) & 2) - 1);
- delta *= (729L<<13);
- CHANGE_COUNT(cUP+offset,0,delta);
- LINK(cUP+offset);
- CHANGE_COUNT(cb+offset,0,delta);
- LINK(cb+offset);
- CHANGE_COUNT(cDN+offset,0,delta);
- LINK(cDN+offset);
- }
-
- LINK(cUP);
- LINK(cDN);
- LINK(cb);
- return L2;
- }
-
- #if LARGE_BITMAPS
- CellBlock* UpdateCountsNoWrap(
- CellBlock* cb,
- CellBlock* L2,
- int width) {
-
- // A stream-lined version of UpdateCounts(..)
- // for interior (non-border) cell blocks which cannot wrap.
-
- ulong oldState=cb->state;
- ulong theChange=cb->change;
- ulong newState;
- CellBlock* cUP=cb-width;
- CellBlock* cDN=cb+width;
- long delta;
-
- newState=oldState ^ theChange;
- cb->state=newState;
- if (theChange & 0xFF800000) {
- UPDATE_UD(23,0);
- if (theChange & MSB) { //carry left
- int ctr=3;
- int offset=-1;
- delta=((newState >>30) & 2) - 1;
- CHANGE_COUNT(cUP+offset,ctr,delta);
- LINK(cUP+offset);
- CHANGE_COUNT(cb+offset,ctr,delta);
- LINK(cb+offset);
- CHANGE_COUNT(cDN+offset,ctr,delta);
- LINK(cDN+offset);
- }
- }
- if (theChange & 0x01FF8000) UPDATE_UD(15,1);
- if (theChange & 0x0001FF80) UPDATE_UD( 7,2);
- if (theChange & 0x000001FF) UPDATE_UD_(1,3);
- if (theChange & LSB) { //carry right
- int offset=+1;
- delta = (((newState << 1) & 2) - 1) * (729L<<13);
- CHANGE_COUNT(cUP+offset,0,delta);
- LINK(cUP+offset);
- CHANGE_COUNT(cb+offset,0,delta);
- LINK(cb+offset);
- CHANGE_COUNT(cDN+offset,0,delta);
- LINK(cDN+offset);
- }
-
- LINK(cUP);
- LINK(cDN);
- LINK(cb);
- return L2;
- }
- #endif
-
- ulong ComputeStateChange(CellBlock* cb,uchar* rule) {
-
- // Computes the state change for 32 cells in a cell block.
-
- // The 4 counters are flagged, and only those counters
- // that actually changed need to be considered.
-
- // The algorithm computes 32-bit birth and death masks
- // from the counts: Each counter word cb->count[i]
- // contains two 13-bit counter values. These are
- // used successively to index into the rules table
- // each time extracting two 4-bit fields, for births
- // and deaths according to the 4 modulo-9 counts
- // inherent in the counters. The 4-bit fields from
- // all lookups (or 0) are separately stacked into 32-bit
- // birth and death masks.
-
- // Note: counters that are known not to have changed
- // (MSB clear) cannot effect a state change and are
- // skipped in the evaluation of counts.
-
- // In each bit position then, each state bit selects the
- // corresponding birth or death mask as follows:
-
- // A 0-state selects the birth bit, and a 1-state
- // selects the death bit.
-
- // oldState birth death change
- // 0 0 x 0
- // 0 1 x 1
- // 1 x 0 0
- // 1 x 1 1
-
- // This is conveniently expressed as (in Pascal):
- // change := (old AND death) OR (NOT old AND birth)
-
- // And we can perform this function on all 32 bits at once.
-
- // PowerPC has an instruction (rlwinm) that can & and >>
- // immediate, all in one clock cycle.
- // Let's hope the compiler knows how to use it here.
-
- ulong stateChange;
- ulong db4_hi,db4_lo,d32=0,b32=0;
- long cnt;
-
- if ((cnt=cb->count[0])<0) {
- cb->count[0] = cnt & (~MSB);
- db4_hi = rule[(cnt>>13) & 0x1FFF];//mask off flag bits
- db4_lo = rule[ cnt & 0x1FFF];
- d32 |= ((db4_hi & 0x0F) << 28)|((db4_lo & 0x0F) << 24);
- b32 |= ((db4_hi & 0xF0) << 24)|((db4_lo & 0xF0) << 20);
- }
- if ((cnt=cb->count[1])<0) {
- cnt &= (~MSB);
- cb->count[1] = cnt;
- db4_hi = rule[cnt >> 13];
- db4_lo = rule[cnt & 0x1FFF];
- d32 |= ((db4_hi & 0x0F) << 20)|((db4_lo & 0x0F) << 16);
- b32 |= ((db4_hi & 0xF0) << 16)|((db4_lo & 0xF0) << 12);
- }
- if ((cnt=cb->count[2])<0) {
- cnt &= (~MSB);
- cb->count[2] = cnt;
- db4_hi = rule[cnt >> 13];
- db4_lo = rule[cnt & 0x1FFF];
- d32 |= ((db4_hi & 0x0F) << 12)|((db4_lo & 0x0F) << 8);
- b32 |= ((db4_hi & 0xF0) << 8)|((db4_lo & 0xF0) << 4);
- }
- if ((cnt=cb->count[3])<0) {
- cnt &= (~MSB);
- cb->count[3] = cnt;
- db4_hi = rule[cnt >> 13];
- db4_lo = rule[cnt & 0x1FFF];
- d32 |= ((db4_hi & 0x0F) << 4)| (db4_lo & 0x0F);
- b32 |= (db4_hi & 0xF0) |((db4_lo & 0xF0) >> 4);
- }
-
- stateChange = (cb->state & d32)
- | (~(cb->state) & b32);
- cb->change=stateChange;
- return stateChange;
- }
-
- void UnloadCells(CellBlock* cb,int size,void* baseAddr) {
-
- // Simple sequential copy of all cell states from
- // cell blocks to bit map, loop unrolled once.
-
- ulong* bm=(ulong*)baseAddr;
- while (size>1) {
- size-=2;
- bm[0]=cb[0].state;
- bm[1]=cb[1].state;
- bm+=2;
- cb+=2;
- }
- if (size) *bm=cb->state;
- }
-
- //
- // The following 2 functions will handle the loading and
- // unloading of cell blocks for the less likely scenarioes.
- // Keeping these complications out of the mainstream functions
- // above is really only a tweak to not waste time on frequent
- // checks, noticeable only with very short runs.
-
- CellBlock* LoadCellsSpontaneous(
- BitMap cells,
- int width,
- int padWords,
- CellBlock* cb,
- CellBlock* L1) {
-
- // Same as LoadCells(..) above, but handles the special case
- // of "spontaneous birth" by linking all cells.
-
- ulong* bm=(ulong*)cells.baseAddr;
- int x,y;
-
- if (width>1) {
-
- //top edge:
- SETBLOCKs(UP+LEFT);
- for (x=1;x<width-1;x++) SETBLOCKs(UP);
- SETBLOCKs(UP+RIGHT);
- bm+=padWords;
-
- //middle rows:
- for (y=1;y<cells.bounds.bottom-1;y++) {
- SETBLOCKs(LEFT);
- for (x=1;x<width-1;x++) SETBLOCKs(0);
- SETBLOCKs(RIGHT);
- bm+=padWords;
- }
-
- //bottom edge:
- SETBLOCKs(DOWN+LEFT);
- for (x=1;x<width-1;x++) SETBLOCKs(DOWN);
- SETBLOCKs(DOWN+RIGHT);
- } else {
-
- //case of bit map <= 32 cells wide
- //top edge:
- SETBLOCKs(UP+LEFT+RIGHT);
- bm+=padWords;
- //middle rows:
- for (y=1;y<cells.bounds.bottom-1;y++) {
- SETBLOCKs(LEFT+RIGHT);
- bm+=padWords;
- }
- //bottom edge:
- SETBLOCKs(DOWN+LEFT+RIGHT);
- }
- return L1;
- }
-
- void UnloadPaddedCells(
- BitMap cells,
- int width,
- int padWords,
- CellBlock* cb){
-
- // Copy all cell states from cell blocks to bit map,
- // row by row, skipping unused pad words at the end of
- // each row in the bit map.
-
- int x,y;
- ulong* bm=(ulong*)(cells.baseAddr);
- for (y=0;y<cells.bounds.bottom;y++) {
- for (x=0;x<width;x++) {
- *bm++=cb->state;
- cb++;
- }
- bm+=padWords;
- }
- }
-